System and method for ECMP load sharing

ABSTRACT

A packet classifier and a method for routing a data packet are provided. The packet classifier includes a content addressable memory, a translation table and a parameter memory. The method includes looking up a content addressable memory for a base address into a parameter memory using a header of the data packet. The base address is related to the routes under ECMP for forwarding the data packet. From among these addresses, using multiple headers of the data packet, an adjustment to the base address is computed. The adjustment specifies an actual address to the parameter memory corresponding to a selected route for forwarding the data packet. The parameter memory is then accessed using the actual address to obtain parameter values relevant to the selected route. The data packet is then forwarded according to the parameter values thus obtained.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims priority of U.S. provisional patent application No. 60/823,178, filed Aug. 22, 2006, incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to routing data packets in a computer network. In particular, the present invention relates to routing data packets in a computer network in which equal cost multi-path (ECMP) load sharing is available.

2. Discussion of the Related Art

In a router, a data packet is directed from an input port to an output port based on the data specified in the header of the data packet. Typically, a content addressable memory (CAM) quickly maps the IP header information or layer 3 header of the data packet to a memory location of a parameter random access memory (PRAM) which stores information that indicates the route over which the data packet should be forwarded, and the parameter values relevant to the route. In one prior art router, the data stored in the PRAM includes a forwarding identifier (FID), types of service (TOS), priority, number of copies (e.g., a multicast address) and other information. The circuit including the CAM and the PRAM is sometimes referred to as a “packet classifier.” The packet classifier encodes the FID and other data into an internal header for the data packet which is used by the router to direct the data packet at line speed to an output port, where the data packet is forwarded to the next switch on a path to its destination.

In that prior art router, the CAM entries are mapped one-to-one to the PRAM entries. That is, only one FID can result from the CAM look-up based on the header of the data packet. In an Internet Protocol (IP) network, a data packet may be routed through any of a number of paths through multiple switches to its destination. ECMP load sharing is one method known to those skilled in the art by which multiple-path routing of IP data traffic can be accomplished. One method to support ECMP is to resolve the level 3 (IP layer) and level 4 (“transmission control protocol” or TCP layer) headers of a data packet into any one of a number of FIDs, where each FID represents a different output port of the router that is connected to a switch on a different one of the possible paths for the data packet. However, because the CAM entries are mapped one-to-one with the PRAM entries, the prior art router does not provide efficient hardware support for ECMP load sharing.

“JetCore™ Based Chassis System: An Architecture Brief on NetIron, BigIron and FastIron Systems” and “Next Generation Terabit System Architecture: The High Performance Revolution for 10 Gigabit Networks” are white papers available from Foundry Networks, Inc. that disclose designs for high performance routers in the prior art. These white papers are hereby incorporated by reference in their entireties to provide background information.

SUMMARY OF THE INVENTION

According to one embodiment of the present invention, a packet classifier and a method for routing a data packet are provided. The packet classifier includes a content addressable memory, a translation table and a parameter memory. The method includes looking up a content addressable memory for a base address into a parameter memory using a first portion of the data packet. The base address is related to the various routes for forwarding the data packet. From among these addresses, using a second portion of the data packet, an adjustment to the base address is computed, the adjustment specifying an actual address to the parameter memory corresponding to a selected route for forwarding the data packet. The parameter memory is then accessed using the actual address to obtain parameter values relevant to the selected route. The data packet is then forwarded according to the parameter values thus obtained.

The present invention is better understood upon consideration of the detailed description below, in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows packet classifier 100 that provides translation table 102 between CAM 101 and PRAM 103, according to one embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides hardware support for ECMP by an indirect coupling between the CAM and the PRAM. (While the detailed description herein uses a CAM in one example to illustrate one way to implement a parameter look-up or search function in a packet classifier, other implementations are possible. Other memory elements, such as dynamic random access memories (DRAM) or static random access memories (SRAM), may also be used.) According to one embodiment of the present invention, FIG. 1 shows a packet classifier 100 that provides translation table 102 between CAM 101 and PRAM 103. In the embodiment of FIG. 1, the header of a data packet is used to perform a route look-up in CAM 101 to obtain an index (“route lookup result”) into translation table (or “re-mapping RAM”) 102, which provides a base address to PRAM 103. This base address is then modified to obtain the address of one of a number of entries in PRAM 103. Each of these entries contains a different FID pointing to a different output port representing the next hop in a different ECMP route.
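The indirection described above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware design: the dictionaries standing in for CAM 101, translation table 102 and PRAM 103, the example key, the FID names and the `flow_hash` argument are all hypothetical, and an exact-match dictionary replaces the CAM's actual matching.

```python
# Hypothetical software model of the CAM -> translation table -> PRAM indirection.
# All table contents below are made-up illustration values.

# Stand-in for CAM 101: maps a header key to a route lookup result (an index).
cam = {"10.0.0.0/8": 5}

# Stand-in for translation table 102: index -> (PRAM base address, number of routes).
translation_table = {5: (0x40, 3)}

# Stand-in for PRAM 103: address -> forwarding identifier (FID) of one next hop.
pram = {0x40: "FID_A", 0x41: "FID_B", 0x42: "FID_C"}


def classify(header_key: str, flow_hash: int) -> str:
    """Return the FID selected for a packet with the given key and flow hash."""
    route_index = cam[header_key]                      # route look-up in the CAM
    base, num_routes = translation_table[route_index]  # indirect step
    # Modify the base address to pick one entry; the example base 0x40 has its
    # low bits clear, so OR-ing in the offset steps through consecutive entries.
    address = base | (flow_hash % num_routes)
    return pram[address]


if __name__ == "__main__":
    for h in (0, 1, 2, 3):
        print(h, classify("10.0.0.0/8", h))
```

In this sketch, a single CAM entry resolves to any of three PRAM entries, which is the one-to-many behavior that a direct CAM-to-PRAM mapping cannot provide.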

According to the embodiment of the present invention shown in FIG. 1, CAM 101 is logically organized as a 256 k×72×3-bit memory, which provides a 21-bit index (TCAM_INDEX) into translation table 102. Translation table 102 is logically organized as a 2 M×32-bit memory. The 32-bit output datum read from translation table 102 includes a base index to PRAM 103. In one embodiment, the 32-bit output datum includes a 19-bit base index (“PRAM_INDEX_BASE”), a 4-bit field “ECMP_MASK” and management information (e.g., an aging bit). The ECMP_MASK field may encode the number of ECMP paths available for that data packet. The 19-bit base index is combined with a hash function of the IP and TCP headers of the data packet to obtain an actual 19-bit index into PRAM 103. Each entry of PRAM 103 contains an FID and other information relevant to the routing of the data packet. Mathematically, the 32-bit output datum CAM2PRAM_DATA is given by

CAM2PRAM_DATA[31:0]=READ (TCAM_INDEX[20:0])

where the function READ( ) represents the data read from translation table 102 using the 21-bit TCAM_INDEX as address into translation table 102.
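The read and the field extraction can be sketched as below. The description gives the field widths (19-bit PRAM_INDEX_BASE, 4-bit ECMP_MASK) but not their bit positions within the 32-bit word, so the layout used here is purely an assumption for illustration, as are the function names and the example table entry.

```python
# Assumed (hypothetical) layout of the 32-bit CAM2PRAM_DATA word:
#   bits [18:0]  PRAM_INDEX_BASE (19 bits)
#   bits [22:19] ECMP_MASK       (4 bits)
#   bits [31:23] management information (e.g., aging bit); not decoded here
# The description states the field widths only, not their positions.

PRAM_INDEX_BASE_BITS = 19
ECMP_MASK_BITS = 4


def read_translation_table(table: dict[int, int], tcam_index: int) -> int:
    """Model of READ(TCAM_INDEX[20:0]): fetch the word stored at a 21-bit index."""
    if not 0 <= tcam_index < (1 << 21):
        raise ValueError("TCAM_INDEX must fit in 21 bits")
    return table[tcam_index]


def unpack_cam2pram_data(word: int) -> tuple[int, int]:
    """Split a 32-bit word into (PRAM_INDEX_BASE, ECMP_MASK) under the assumed layout."""
    if not 0 <= word < (1 << 32):
        raise ValueError("CAM2PRAM_DATA must fit in 32 bits")
    pram_index_base = word & ((1 << PRAM_INDEX_BASE_BITS) - 1)
    ecmp_mask = (word >> PRAM_INDEX_BASE_BITS) & ((1 << ECMP_MASK_BITS) - 1)
    return pram_index_base, ecmp_mask


if __name__ == "__main__":
    # Made-up entry at index 7: base index 0x12340, ECMP_MASK = 3.
    table = {7: (3 << 19) | 0x12340}
    print(unpack_cam2pram_data(read_translation_table(table, 7)))
```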

According to one embodiment of the present invention, an 8-bit ECMP index is obtained by a hash function using the MAC, IP and TCP headers of the data packet, and a random number. The portions of the MAC, IP and TCP headers that are used may be, for example, the source and destination MAC addresses, the IP source and destination addresses and the TCP source and destination port numbers. The random number is a number generated at the initialization process of the router. The random number provides a personalization for the router (i.e., from the same data packet, a different router would map to a different address in the PRAM). One advantage of using the Random_Number is a more distributed load balance in situations where multiple routers are connected in a hierarchical fashion. Mathematically, the 8-bit ECMP index is given by:

ECMP_INDEX[7:0]=hash(L2, L3, L4, Random_Number)

where hash is a hash function with low collision probability, L2, L3 and L4 represent selected fields in the MAC, IP and TCP headers of the data packet, and Random_Number is the personality random number of the router.
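The description does not name a particular hash function. As a minimal sketch, the example below substitutes CRC-32 (from Python's standard `zlib` module) folded down to 8 bits; the choice of CRC-32, the XOR folding, the 32-bit width assumed for Random_Number and the example field values are all assumptions made for illustration only.

```python
import zlib


def ecmp_index(l2: bytes, l3: bytes, l4: bytes, random_number: int) -> int:
    """Return an 8-bit ECMP index from selected header fields and a per-router number.

    CRC-32 is a stand-in here; the description does not specify the hash.
    """
    # Assumption: the router's random number fits in 32 bits.
    seed = random_number.to_bytes(4, "big")
    crc = zlib.crc32(l2 + l3 + l4 + seed)
    # Fold the 32-bit CRC down to 8 bits by XOR-ing its four bytes together.
    return (crc ^ (crc >> 8) ^ (crc >> 16) ^ (crc >> 24)) & 0xFF


if __name__ == "__main__":
    l2 = bytes.fromhex("001122334455") + bytes.fromhex("66778899aabb")  # src + dst MAC
    l3 = bytes([10, 0, 0, 1]) + bytes([10, 0, 0, 2])                    # src + dst IPv4
    l4 = (12345).to_bytes(2, "big") + (80).to_bytes(2, "big")           # src + dst port
    print(ecmp_index(l2, l3, l4, random_number=0xC0FFEE))
```

Because the same fields always hash to the same value on a given router, all packets of one flow select the same route, while a router initialized with a different random number would generally spread the same flows differently.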

In one embodiment, the ECMP_INDEX is mapped into one of the possible routes (up to a maximum of 16, i.e., a 5-bit number). As mentioned above, the number of possible routes is encoded in the ECMP_MASK field. The number of possible routes (“ECMP_BASE”) is given by adding one to the 4-bit ECMP_MASK field:

ECMP_BASE[4:0]=ECMP_MASK[3:0]+4'h1;

Using the ECMP_INDEX and the ECMP_BASE, a 4-bit offset to the 19-bit base index to PRAM 103 (“ECMP_ADJUST”) is obtained using the modulo function (i.e., taking the remainder from an integer divide of ECMP_INDEX by ECMP_BASE):

ECMP_ADJUST[3:0]=(ECMP_INDEX[7:0] % ECMP_BASE[4:0]);
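The two computations above can be sketched as follows. The function names are hypothetical; the arithmetic follows the ECMP_BASE and ECMP_ADJUST expressions directly. The short demonstration counts how the 256 possible ECMP_INDEX values spread over three routes.

```python
from collections import Counter


def ecmp_base(ecmp_mask: int) -> int:
    """Number of possible routes: the 4-bit ECMP_MASK plus one (1 to 16)."""
    if not 0 <= ecmp_mask <= 0xF:
        raise ValueError("ECMP_MASK must fit in 4 bits")
    return ecmp_mask + 1


def ecmp_adjust(ecmp_index: int, ecmp_mask: int) -> int:
    """4-bit offset: remainder of the 8-bit ECMP_INDEX divided by ECMP_BASE."""
    if not 0 <= ecmp_index <= 0xFF:
        raise ValueError("ECMP_INDEX must fit in 8 bits")
    return ecmp_index % ecmp_base(ecmp_mask)


if __name__ == "__main__":
    # ECMP_MASK = 2 encodes three routes; count index values landing on each.
    counts = Counter(ecmp_adjust(i, 2) for i in range(256))
    print(sorted(counts.items()))
```

With three routes, the 256 index values split 86, 85 and 85 ways, so if the hash output is close to uniform, the residual imbalance is at most one index value per route. When ECMP_BASE divides 256 evenly (1, 2, 4, 8 or 16 routes), the split is exact.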

The modulo operation provides close to perfect load balancing across equal cost paths. The 4-bit offset ECMP_ADJUST is logically OR'd with the 19-bit base index PRAM_INDEX_BASE to obtain the actual index into PRAM 103:

PRAM_INDEX_TO_USE=PRAM_INDEX_BASE|ECMP_ADJUST[3:0]

Using this actual index, the parameter values and the FID are obtained from PRAM 103 to perform forwarding of the data packet.
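The final step can be sketched as below. One point worth making explicit: a logical OR yields consecutive PRAM entries only when the low-order bits of PRAM_INDEX_BASE that the offset occupies are zero. The description does not state how base indices are allocated, so the alignment check `is_aligned`, like the other names and values here, is a hypothetical illustration of that assumption.

```python
def pram_index_to_use(pram_index_base: int, ecmp_adjust: int) -> int:
    """OR the 4-bit ECMP_ADJUST offset into the 19-bit PRAM_INDEX_BASE."""
    if not 0 <= pram_index_base < (1 << 19):
        raise ValueError("PRAM_INDEX_BASE must fit in 19 bits")
    if not 0 <= ecmp_adjust <= 0xF:
        raise ValueError("ECMP_ADJUST must fit in 4 bits")
    return pram_index_base | ecmp_adjust


def is_aligned(pram_index_base: int, num_routes: int) -> bool:
    """Hypothetical check: are the base's low bits clear for every possible offset?"""
    low_bits = (num_routes - 1).bit_length()
    return pram_index_base & ((1 << low_bits) - 1) == 0


if __name__ == "__main__":
    base = 0x12340  # made-up base index with its low four bits clear
    print(is_aligned(base, 3))
    print([hex(pram_index_to_use(base, off)) for off in range(3)])
    # A misaligned base would map two offsets to the same entry:
    print(hex(pram_index_to_use(0x12341, 0)), hex(pram_index_to_use(0x12341, 1)))
```

With the aligned base, offsets 0, 1 and 2 select entries 0x12340, 0x12341 and 0x12342. With the misaligned base 0x12341, offsets 0 and 1 both select 0x12341, which is why this sketch assumes software would place each route group on a suitably aligned base.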

In this manner, hardware support for load balanced ECMP computations is provided without requiring a line card or central processing unit management intervention. ECMP traffic can therefore be processed at line rate. Further, increasing the number of routes handled by the CAM for ECMP can be achieved without a corresponding increase in PRAM size, as the PRAM entries can be shared for the same next hop, thereby providing significant cost savings. Because software controls the many-to-one or one-to-many mappings between route lookups of CAM 101 and the associated PRAM entries in PRAM 103, the present invention allows certain statistics to be collected (e.g., for route groups).

The detailed description above is provided to illustrate specific embodiments of the present invention and is not intended to be limiting. Numerous variations and modifications within the scope of the present invention are possible. The present invention is set forth in the following claims.

1. A network device comprising: at least one port configured to receive a data packet; and a first memory and a second memory, wherein the network device is configured to: retrieve a first entry in the first memory based on a portion of a first received data packet; retrieve a first entry in the second memory based on the first entry in the first memory, the first entry in the second memory including information associated with a route over which the first received data packet should be forwarded; retrieve the first entry in the first memory based on a portion of a second received data packet; and retrieve a second entry in the second memory based on the first entry in the first memory, the second entry in the second memory being distinct from the first entry in the second memory, the second entry in the second memory including information associated with a route over which the second received data packet should be forwarded, wherein the portion of the first received data packet and the portion of the second received data packet are substantially the same.

2. The network device of claim 1 wherein the first memory is a content addressable memory (CAM) and wherein the second memory is a parameter random access memory (PRAM).